Abstract — CDNs have been widely used to provide low latency, scalability,
fault tolerance, and load balancing for the delivery of web content
and more recently streaming media. We propose a system that improves 
the performance of streaming media CDNs by exploiting the path diver- 
sity provided by existing CDN infrastructure. Path diversity is provided by 
the different network paths that exist between a client and its nearby edge 
servers; and multiple description (MD) coding is coupled with this path di- 
versity to provide resilience to losses. In our system, MD coding is used to 
code a media stream into multiple complementary descriptions, which are 
distributed across the edge servers in the CDN. When a client requests a 
media stream, it is directed to multiple nearby servers which host comple- 
mentary descriptions. These servers simultaneously stream these comple- 
mentary descriptions to the client over different network paths. 

This paper provides distortion models for MDC video and conventional 
video. We use these models to select the optimal pair of servers with com- 
plementary descriptions for each client while accounting for path lengths 
and path jointness and disjointness. We also use these models to evaluate 
the performance of MD streaming over CDNs in a number of real and gen- 
erated network topologies. Our results show that distortion reduction by 
about 20 to 40% can be realized even when the underlying CDN is not de- 
signed with MDC streaming in mind. Also, for certain topologies, MDC 
requires about 50% fewer CDN servers than conventional streaming tech- 
niques to achieve the same distortion at the clients. 

I. Introduction 

Content delivery networks (CDNs) were developed to
overcome performance problems, such as network conges- 
tion and server overload, that arise when many users access pop- 
ular content. CDNs improve end-user performance by caching 
popular content on edge servers located closer to users. This 
provides a number of advantages. First, it helps prevent server 
overload, since the replicated content can be delivered to users 
from edge servers. Furthermore, since content is delivered from 
the closest edge server and not from the origin server, the content 
is sent over a shorter network path, thus reducing the request re- 
sponse time, the probability of packet loss, and the total network 
resource usage. While CDNs were originally intended for static 
web content, recently, they have been applied to the delivery of 
streaming media as well. 

Streaming media is characterized by data that has a strict de- 
lay constraint. This delay constraint makes streaming media 
very sensitive to packet loss and network outages. For exam- 
ple, when receiving a streaming media session, data that arrives 
late is useless. Not only does streaming media suffer from the 
same problems associated with static content delivery, it also 
presents additional challenges due to the real-time nature of the 
content. Conventional approaches for dealing with packet loss 
for static data, such as retransmissions, may not be possible in a 
streaming context. Thus, additional mechanisms are needed to 
provide streaming media delivery over packet networks. 

Of the various techniques to improve streaming media qual- 
ity, a method of multiple description coding (MDC) with path 
diversity was proposed in [1]. MDC codes a media stream into 
two (or more) complementary descriptions. These descriptions 
have the property that if either description is received it can be
used to decode baseline quality video, and both descriptions can
be used to decode improved quality video. This is in contrast
to conventional video coders (e.g. MPEG-1/2/4, H.261/3,
Microsoft's and RealNetworks' proprietary coders), which produce
a single stream that does not have these MD properties; we
refer to these methods as single description coding (SDC).

MDC combines particularly well with path diversity, in which 
the different descriptions are explicitly sent over different routes 
to a client. Path diversity exploits the fact that while any network 
link may suffer from packet loss, there is a much smaller chance 
that two network paths simultaneously suffer from losses. In 
other words, losses on the two paths are likely to be uncorre- 
lated. MDC combined with path diversity is beneficial for delay- 
sensitive, real-time applications such as streaming media, where 
data losses, especially consecutive ones, are highly disruptive to 
the application. In prior work [1], path diversity was achieved 
using either a relay infrastructure or source-based routing. 

In this work, we achieve error resilient media streaming by 
using MDC and leveraging CDN infrastructure to provide path 
diversity. We use MDC to code a media stream into multiple de- 
scriptions, and distribute copies of these descriptions across sur- 
rogates in the CDN. When a client requests a media stream, it is 
directed to multiple nearby surrogates which host complemen- 
tary descriptions of the stream. The client simultaneously re- 
ceives the different descriptions through different network paths 
from the different surrogates. That is, we leverage the existing 
CDN infrastructure to achieve path diversity between multiple 
surrogates and the client. In this way, disruption in streaming 
media occurs only in the less likely case when simultaneous 
losses afflict both paths. This architecture also reaps the benefits 
associated with CDNs, such as reduced response time to clients, 
load balancing across servers, robustness to network and server 
failures, and scalability in the number of clients.

This paper continues in Section II by describing how CDNs 
can be used to achieve path diversity. Section III discusses ar- 
chitecture design issues that arise in using MDC in CDNs. Sec- 
tion IV characterizes the performance of MDC when used with 
path diversity. Section V presents simulation results on MD- 
CDN performance for various network topologies. Section VI 
mentions additional related work. Section VII concludes with a 
brief summary. 

II. Path Diversity in CDN 

Diversity schemes, such as frequency, time, and spatial diver- 
sity, have been widely employed to improve system reliability 
in wireless communications [2]. In wired networks, only time 
diversity or interleaving is typically exploited due to the lack of 
infrastructure support for path diversity. However, the benefits 
of path diversity can be significant due to the potentially highly 
variable nature of the quality of each individual path [3], and 
the frequent failure to identify the single best path [4]. IP source
routing is one possible mechanism to achieve path diversity but 
is not widely supported. The advent of the CDN provides a 
new platform under which path diversity can be realized with- 
out resorting to explicit path-diversity mechanisms [1], [5]. By 
virtue of having the original content replicated at multiple ge- 
ographically or topologically separated surrogates, a CDN pro- 
vides a client multiple paths of different characteristics to access 
the same content. 




Fig. 1. Exploiting path diversity of a CDN (left) and abstraction of the two-path 
path diversity between a client and two surrogates within a CDN (right) 

Fig. 1 illustrates how path diversity offered by a CDN is typ- 
ically exploited to provide fault tolerance when only a single 
path between a client and a surrogate is used. Normally, a client 
indicated by the circle communicates with the closest surrogate
S1 via path P1. In the case when link L1 goes down or S1 is
overloaded, the client can be redirected to an alternate surrogate
S2 and obtain the same content from S2 via path P2. Similarly,
if link L2 goes down or S2 is overloaded, the client can
again be redirected to an alternate surrogate S3 and obtain the
same content from S3 through a longer path P3. Ideally, the best
achievable reception for a single path is attained when the client
instantaneously switches to the best path. However, due to the
lack of a priori information to facilitate instantaneous switch- 
ing, reactive switching is used. As a result, the applicability of 
the scheme is restricted to more persistent network impairments, 
such as network outages, and is largely ineffective against tran- 
sient network losses. 

We propose an additional way to exploit path diversity provided
by CDNs by enabling simultaneous communication between
a client and multiple surrogates. For instance, the client
in Fig. 1 can obtain half of the content from S1 using path P1,
and the other half from S2 via P2. When link L1 goes down, the
client can reactively switch to using S2 and S3 through paths P2
and P3, respectively, achieving fault tolerance. One key advantage
of simultaneously using multiple paths is the reduction in
the probability of simultaneous, correlated loss in all paths, re- 
gardless of the loss characteristics of individual links. Whether 
such a feature can be translated into benefits for video applica- 
tions depends on our ability to exploit it. In Section IV, we will 
describe MDC and its relevant characteristics for transmission 
using multiple paths. 

Of course, the use of multiple paths does not guarantee independence
of the paths. Generally, parts of the paths may be disjoint
while other parts may overlap. For instance, when paths P1
and P2 in Fig. 1 are used, the links L2 and L3 are shared while
links L1 and L4 are not. When losses occur in either link L1 or
L4, only one path is affected. On the other hand, if losses occur
in either link L2 or L3, both paths are affected. Thus, the advantage
of having multiple paths depends on the characteristics
of the "joint" part of the paths. If most of the losses occur in the
joint part of the paths, there is little advantage in using multiple
paths. Conversely, if the "joint" part has relatively few losses,
then the benefit of using multiple paths is enhanced. If paths P2
and P3 of Fig. 1 are used instead of paths P1 and P2, the number
of joint links is decreased from two to one, which may be
preferable despite the fact that P3 is longer than P1. The impact
of losses in the joint and disjoint parts of each path
is examined in the context of MDC and path diversity in Section IV.
We also examine the joint/disjoint path characteristics
for real and generated network topologies in Section V.
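
The joint/disjoint bookkeeping described above can be illustrated with a
short sketch. The link lists below are assumptions: P1 and P2 are chosen to
match the sharing pattern stated in the text (L2 and L3 shared, L1 and L4
not), while the makeup of P3 (links L5 to L7 plus L3) is purely
hypothetical and is not read off Fig. 1.

```python
from typing import Hashable, Sequence, Tuple


def split_joint_disjoint(
    path_a: Sequence[Hashable], path_b: Sequence[Hashable]
) -> Tuple[int, int, int]:
    """Return (#links only on path_a, #links on both, #links only on path_b)."""
    links_a, links_b = set(path_a), set(path_b)
    joint = links_a & links_b
    return len(links_a - joint), len(joint), len(links_b - joint)


# Link lists are illustrative assumptions (see the paragraph above).
P1 = ["L1", "L2", "L3"]
P2 = ["L4", "L2", "L3"]
P3 = ["L5", "L6", "L7", "L3"]  # hypothetical longer path sharing one link with P2

print(split_joint_disjoint(P1, P2))  # (1, 2, 1): two joint links
print(split_joint_disjoint(P2, P3))  # (2, 1, 3): one joint link, longer second path
```

Under these assumed link lists, moving from the pair (P1, P2) to the pair
(P2, P3) trades a longer path for fewer shared links, which is the tradeoff
the text describes.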

III. MD-CDN Architecture Design 

This section discusses the architectural design issues that 
arise when using CDNs for delivering MD coded content. We
refer to such a system as a Multiple Description CDN (MD- 
CDN). Some of these issues are also found in a traditional 
streaming CDN, which we refer to as Single Description CDN 
(SD-CDN), but require alternative solutions to optimize for the 
MD case. In the case of MD streaming within an existing CDN 
infrastructure, the design issues that arise include (1) how to dis- 
tribute the MD streams across the existing surrogates, a process 
which we refer to as MD distribution across surrogates, and (2) 
how to select for each client multiple surrogates with complementary
descriptions, which we refer to as MD surrogate selection.
In the case of deploying an MD-CDN from scratch, the
design issue of optimal MD surrogate placement also arises. 

A. MD Distribution ("Coloring") Across Surrogates

Previous work on server and CDN surrogate placement [6],
[7] focuses on placing mirrors or replicas, in which the assumption
is that complete content is stored at each chosen server. For MD- 
CDN, this assumption is invalid because a unit of content (e.g. a 
movie) is divided into multiple complementary descriptions that 
are spread across a number of surrogates. To be specific, for 
MD with two descriptions, in general each surrogate may host 
0, 1, or both descriptions. An important special case is when 
each surrogate hosts one description, which leads to the notion 
of coloring where we assign to each surrogate a particular color 
corresponding to a unique description. The goal of this coloring 
problem is to color the surrogates so that a complete set of descriptions
(e.g. both descriptions for MD with two descriptions)
is close to every client. While this notion of coloring is useful,
it is also unnecessarily constraining, since in general each
surrogate may host both descriptions or neither.
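
One way to make the coloring goal concrete is to score a coloring by how far
each client must reach to collect a complete set of descriptions. The sketch
below is an illustration only: the scoring rule (the largest of the
per-description nearest distances), the function name, and the example
distances are all assumptions, not a metric defined in this paper.

```python
from typing import Dict, Set


def distance_to_complete_set(
    dist_to_surrogate: Dict[str, float],
    hosted: Dict[str, Set[int]],
    num_descriptions: int = 2,
) -> float:
    """Distance a client must reach to obtain every description.

    For each description, take the closest surrogate hosting it; the score
    is the largest of those distances (infinity if a description is missing).
    """
    worst = 0.0
    for description in range(num_descriptions):
        candidates = [
            dist
            for name, dist in dist_to_surrogate.items()
            if description in hosted.get(name, set())
        ]
        if not candidates:
            return float("inf")
        worst = max(worst, min(candidates))
    return worst


# Hypothetical client-to-surrogate distances (e.g. hop counts).
distances = {"A": 2, "B": 3, "C": 7}

# Coloring 1: the two nearest surrogates carry the same description.
print(distance_to_complete_set(distances, {"A": {0}, "B": {0}, "C": {1}}))  # 7
# Coloring 2: the two nearest surrogates carry complementary descriptions.
print(distance_to_complete_set(distances, {"A": {0}, "B": {1}, "C": {0}}))  # 3
```

Because each surrogate maps to a set of descriptions, the same function also
covers the general case in which a surrogate hosts both descriptions or none.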

B. MD Surrogate Selection 

A number of selection algorithms are possible with vary- 
ing deployment complexity and MDC-biased performance opti- 
mization. Current CDN request (re-)direction mechanisms, such 
as DNS request routing, assume a single path when selecting a 
surrogate for a client. Therefore, they do not consider proper- 
ties of multiple paths, such as disjointness. Nonetheless, these 
mechanisms can be applied in an MD-CDN setting, by simply
choosing the best N surrogates with N complementary descriptions,
where best is determined by, e.g., shortest path. More
sophisticated algorithms optimized for MD-CDN that require 
additional network and systems support are also possible, and 
would improve the performance of MD-CDN. In Section V we 
present two algorithms that take into account specific properties 
of MD and path diversity in the surrogate selection process. 
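
The simple baseline mentioned above (pick the best N surrogates holding N
complementary descriptions, with "best" meaning shortest path) might look
like the following sketch. The function name, the greedy
one-surrogate-per-description rule, and the example hop counts are
assumptions made for illustration; as in the text, path disjointness is
ignored here.

```python
from typing import Dict, List, Set


def select_nearest_surrogates(
    hops: Dict[str, int],
    hosted: Dict[str, Set[int]],
    num_descriptions: int = 2,
) -> List[str]:
    """For each description, pick the closest not-yet-used surrogate hosting it."""
    chosen: List[str] = []
    for description in range(num_descriptions):
        candidates = sorted(
            (hops[name], name)
            for name, held in hosted.items()
            if description in held and name in hops and name not in chosen
        )
        if not candidates:
            raise ValueError(f"no unused surrogate hosts description {description}")
        chosen.append(candidates[0][1])  # shortest path wins, ties broken by name
    return chosen


# Hypothetical hop counts from one client to three surrogates.
hops = {"S1": 3, "S2": 4, "S3": 6}
hosted = {"S1": {0}, "S2": {1}, "S3": {0, 1}}
print(select_nearest_surrogates(hops, hosted))  # ['S1', 'S2']
```

The greedy loop is not guaranteed to minimize the total distance in every
case; it is meant only to show the shape of a shortest-path selection rule.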

We evaluate the performance of a combination of surrogate
coloring and selection algorithms in Section V.

C. MD Surrogate Placement

Surrogate placement is the problem of finding the best loca- 
tions to deploy surrogates to optimize client performance. This 
is also called the "facility location problem" or the "p-center 
problem", and a number of graph theoretic approximation al- 
gorithms have been proposed. Previous work [6], [7] in the SD- 
CDN case places surrogates in order to minimize distances from 
surrogates to clients when servicing requests. For SD-CDN re- 
searchers, the main goal is to come up with practical approaches 
that are deployable in the current Internet infrastructure. However,
optimally placing surrogates for an MD-CDN is more complex
than for an SD-CDN because there are two potentially opposing
objectives: minimizing the distance from clients to surrogates,
and maximizing path disjointness between multiple surrogates and
each client.

Although the ideal MD surrogate placement would account 
for both path distance and diversity between clients and surro- 
gates, current CDN and data center infrastructures are designed 
to optimize only for the former but not the latter. For example, 
Akamai [8] already has more than 10,000 surrogates located at 
the edge of the Internet to reduce client access time to surro- 
gates; Digital Island has data centers at a few well-connected 
places in the world. Therefore, it may not be practical to require 
an MD-CDN to work off a completely different set of surrogates.
Rather, for ease of deployment and economic reasons, it may be
highly beneficial for the MD-CDN to leverage existing CDN and
data center infrastructures to deliver descriptions from multiple
surrogates to a client. We show in Section V that MD performs
well in an SD-optimized CDN, where servers are located either
at the edge or in the core of the network.

IV. Multiple Description Coding and Path Diversity

MD coding has been shown to provide improved performance 
in networks with path diversity [1]. This section characterizes 
the performance of MDC in the context of the type of path di- 
versity that can be achieved in a CDN. 

A. Multiple Description Video Coding 

Multiple Description Coding (MDC) refers to a form of com- 
pression where a signal is coded into a number of separate bit- 
streams, where the multiple bitstreams are referred to as multi- 
ple descriptions (MD). These multiple descriptions provide two 
important properties. First, each description can be decoded in- 
dependently to give a usable reproduction of the original signal. 
Second, the multiple descriptions contain complementary infor- 
mation so that the quality of the decoded signal improves with 
the number of descriptions that are correctly received. 

An important point is that the descriptions (MD bitstreams) are
independent of one another and are typically of roughly equal
importance. This is in contrast to conventional layered or scalable
schemes. Layered or scalable approaches essentially prioritize
data and thereby support intelligent discarding of the data (the
enhancement data can be lost or discarded while still maintaining
usable video). However, the base-layer bitstream is critically
important: if it is lost then the other bitstream(s) are useless.
MD coding overcomes this problem by allowing useful reproduction
of the signal when any description is received, with
increasing quality when more descriptions are received.

In the context of path diversity where each path simultaneously
carries a different description, the properties of MDC suggest
that a usable quality is maintained whenever any description
is correctly received. Since using multiple paths reduces
the probability of having simultaneous losses in all the paths, a
scheme in which MDC is used with multiple paths improves the
chance of receiving at least a usable quality of video.

A number of MD video coding algorithms have recently been 
developed, which provide different tradeoffs in terms of compression
performance and error resilience [9], [10], [11], [12]. In
this paper we base our work on the MD video coder presented 
in [12], [1]. Some important characteristics of this coder are: 
(1) high compression efficiency, achieving MDC properties with 
only slightly higher total bit rate than conventional SD compres- 
sion schemes, (2) ability to use correctly received descriptions 
to repair corrupted descriptions over time, (3) ability to success- 
fully operate over paths that support different or unbalanced bit 
rates [13], and (4) standard compatibility, with this MD coder 
being a standard-compatible enhancement to MPEG-4 Version 
2 (with NEWPRED) and H.263 Version 2 (with RPS). A con- 
sequence of (4) is that any MPEG-4 V2 decoder can decode the 
MD bitstream while an enhanced decoder designed to perform 
state recovery as presented in [12] can provide improved error 
recovery. In addition, this form of MD video coding contains 
conventional (SD) coding as a special case, thereby enabling an 
encoder to adapt its processing between SD and MD based on 
the current communication characteristics. 

As discussed before, a general MD coder is designed to oper- 
ate assuming at least one description is correctly received. This 
assumption can be quite restrictive, since over the duration of a
video session both descriptions will generally be partially received
and partially corrupted. One notable benefit of our selected
MD coder is that it allows repair of corrupted descriptions
using uncorrupted descriptions so that usable quality can
be maintained even when there are losses in all descriptions, as
long as the losses do not simultaneously afflict all descriptions.

B. Loss Characteristics of SD and MD Video Streams

This section examines the MD and conventional SD perfor- 
mance for streaming video test sequences over a lossy packet 
network. Specifically, the effects of single and burst losses are
examined in the SD and MD contexts; and in the MD context,
the effects of losses in one and both network paths are examined.

Two test sequences were used. Foreman is a head-and- 
shoulders type sequence similar to a videoconferencing appli- 
cation (144 x 176 pixels/frame at 30 frames/sec) while Bus is 
a more complicated sequence similar to a conventional movie 
(240 x 352 pixels/frame at 30 frames/sec). The MD coder coded 
each sequence into two descriptions, corresponding to the even 
and odd frames. The SD and MD video coding algorithms were 
based on the MPEG-4/H.263-like coder described in Section IV- 
A. To make an appropriate comparison, the sequences were 
coded with MD and SD at the same constant video quality and 
the same total bitrate (bits/sec). Each coder uses a different ap- 
proach for error-resilience: MD via the MD properties, while 
SD devotes extra bits for additional intraframe coding to enable 
it to recover faster from losses. For simplicity, we assume that 
each packet loss results in the loss of an entire frame. This assumption
is appropriate for the Foreman sequence, where an entire
predictively coded frame fits within a single packet; however,
it is a worst-case assumption for the Bus sequence, where a predictively
coded frame typically requires about 5 packets. Details
of the specific comparisons are given in [1].
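
The even/odd split used for the two descriptions can be sketched as follows.
This is only an illustration of the frame bookkeeping: the function names
are hypothetical, and the concealment step (repeat the most recently
received frame) is a deliberately crude stand-in for the state recovery of
[12], which is not reproduced here.

```python
from typing import List, Optional, Sequence, Tuple


def split_even_odd(frames: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Split a frame sequence into an even-frame and an odd-frame description."""
    return list(frames[0::2]), list(frames[1::2])


def merge_with_repeat(
    even: Sequence[Optional[int]], odd: Sequence[Optional[int]]
) -> List[Optional[int]]:
    """Re-interleave two descriptions; None marks a lost frame.

    A lost frame is replaced by the most recently available frame
    (a simple placeholder for real state recovery).
    """
    merged: List[Optional[int]] = []
    last: Optional[int] = None
    for i in range(len(even) + len(odd)):
        source = even if i % 2 == 0 else odd
        frame = source[i // 2]
        if frame is None:
            frame = last  # conceal using the last frame that was available
        else:
            last = frame
        merged.append(frame)
    return merged


even, odd = split_even_odd(list(range(8)))
print(even, odd)  # [0, 2, 4, 6] [1, 3, 5, 7]
# Losing the whole odd description still leaves a half-frame-rate video.
print(merge_with_repeat(even, [None] * len(odd)))  # [0, 0, 2, 2, 4, 4, 6, 6]
```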

Fig. 2. Recovered SD and MD video quality for the Foreman (left) and Bus
(right) sequences in single and burst losses in one and both channels. 

Figure 2 illustrates the performance for MD and SD video 
coding under three types of losses: (1) single loss correspond- 
ing to the loss of a single entire frame (frame 10), (2) two burst 
losses of 100 ms duration, spaced apart by 2/3 sec, which cor- 
responds to the loss of three frames in two locations spaced 
apart by 2/3 sec (starting at frame 10 and frame 30 and af- 
flicting both MD streams), and (3) simultaneous losses in both 
streams. Specifically, in case (2) three frames are lost in the 
even sequence starting at frame 10, and three frames are lost in 
the odd sequence starting at frame 31 (2/3 sec later). Distortion 
is measured in terms of mean-squared error (MSE), and a lower 
distortion (MSE) indicates better quality. 

We draw the following conclusions about SD and MD performance
in the face of packet loss. For a single loss (top row), the
SD error is characterized by an initial jump in distortion and a 
gradual recovery. The MD error is characterized by a very small 
jump in the corresponding affected even or odd subsequence. 
The smaller jump in distortion for MD is because the correctly 
received neighboring frames are used in this form of MD coding 
to perform state recovery to accurately recover the lost frame. 

For a burst loss (middle row), the SD error is characterized 
by a large jump for each consecutive packet loss and a grad- 
ual recovery. For a burst loss in one MD path, the MD error is 
similar to that of a single loss; consecutive losses do not result 
in accumulated distortion because the state recovery at the de- 
coder can recover using correctly received neighboring frames. 
For both SD and MD, losses spaced far enough apart behave as 
independent losses. 

For the simultaneous-loss case (bottom row), the SD error is once
again characterized by a large jump for each consecutive error and
a gradual recovery. For simultaneous losses in both MD paths, the
error is characterized by a jump in distortion for the even and
odd subsequences, and each gradually recovers. Note that the
MD rate of recovery is slower than that of SD because MD coding
uses less intraframe coding (given the same total bit rate
constraint).

MD coding is more resilient to single losses and burst losses 
than SD coding as long as the losses afflict only one channel at 
a time. In this case, the correctly received channel can be used 
to recover the corrupted channel with state recovery techniques 
as described in [12], [1]. Because the corrupted channel is repaired
from the correctly received channel, MD coding is largely immune to the
duration of loss in one channel. In the case of simultaneous errors
affecting both channels, SD recovers more quickly because
of the extra intraframe coding that can be used.

We quantify the distortion for SD through 7 distortion pa- 
rameters: distortion for (1) no loss, (2) loss of one frame, (3) 
recovery after loss of one frame, (4) loss of a second frame, (5) 
recovery after loss of second frame, (6) loss of a third frame, (7) 
recovery after loss of a third frame. We assume that the distor- 
tion saturates for burst loss length larger than 3. The distortion 
for MD is quantified with 5 distortion parameters (assuming balanced
or symmetric MD): distortion of one description for (1) no
loss, (2) loss of one frame (affecting only one description), (3) 
recovery after loss, (4) simultaneous loss of both descriptions, 
(5) recovery after simultaneous loss. Note that it is unnecessary 
to account for burst length in MD since it is largely immune to 
(independent of) the length of the loss as long as the loss afflicts 
only a single description at any point in time. 

C. Modeling Loss Characteristics of SD and MD Streams

This section describes models for comparing SD and MD 
video delivery quality as a function of path characteristics and 
losses. The distortion metric is the mean-square error (MSE) 
in the reconstructed video at the decoder. As discussed in the 
prior section, a number of different types of losses afflict con- 
ventional SD and MD video in important and different ways 
and therefore must be accounted for. These events include iso- 
lated packet loss, burst loss as well as the specific length of the 
burst, and in addition for MD whether the loss afflicts only a 
single description at any point in time or simultaneously afflicts 
both descriptions. To distinguish between these different loss 
events requires a model that can express burst loss and further- 
more can capture and distinguish between the losses that occur 
on joint and disjoint links. In the following, we present models 
of the end-to-end loss processes for single and multiple paths, 
and corresponding distortion models that map loss events for 
SD and MD to actual distortions. Specifically, the distortion 
models capture the important loss events described above, and 
the model for the end-to-end loss process for two-path path di- 
versity accounts for both joint and disjoint links. 

To model and characterize the performance of MD and SD de- 
livery over a simulated network, we introduce two simplifying 
assumptions. First, given a network with a number of links, we 
assume that the burst loss behavior of each link can be modeled 
by a two-state Gilbert model parameterized by transition probabilities
$\{p_0, q_0\}$, where $p_0$ is the probability of going from no
loss (0) to loss (1) and $q_0$ is the probability of going from loss (1)
to no loss (0). The Gilbert model is widely used to model bursty
traffic for its simplicity and mathematical tractability. While
prior work modeled end-to-end packet loss across a single path
in the Internet [14], [15], we propose single link models which
are then used to develop end-to-end loss models. Second, we 
also assume that each link can be modeled as independent. 

A path is modeled as the concatenation of a number of bursty 
single links. When each bursty single link in a path of N links 
is modeled as a Gilbert model, it can be shown that the end- 
to-end probability of loss and loss runlength can be captured 
by a single Gilbert model parameterized by a different set of 
transition probabilities $\{p_{\text{total}}, q_{\text{total}}\}$.
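
A minimal sketch of one way to obtain such end-to-end parameters is given
below. It is an assumption, not necessarily the derivation the authors have
in mind: the path is taken to be in the no-loss state only when every link
is, and the end-to-end Gilbert parameters are chosen to match the resulting
average loss rate and mean loss run length for N identical, independent
links. The numerical values 0.0052 and 0.8 are the per-link parameters the
paper quotes for its Figure 5 example.

```python
from typing import Tuple


def end_to_end_gilbert(p: float, q: float, n_links: int) -> Tuple[float, float]:
    """Gilbert parameters (p_total, q_total) for n identical independent links.

    p: per-link probability of going from no loss to loss.
    q: per-link probability of going from loss to no loss.
    """
    if n_links < 1 or not (0.0 < p <= 1.0) or not (0.0 < q <= 1.0):
        raise ValueError("need n_links >= 1 and p, q in (0, 1]")
    link_good = q / (p + q)              # stationary no-loss probability of one link
    path_good = link_good ** n_links     # all links simultaneously in no-loss
    p_total = 1.0 - (1.0 - p) ** n_links  # leave no-loss when any link enters loss
    # Balance condition of a two-state chain: path_good * p_total = (1 - path_good) * q_total
    q_total = p_total * path_good / (1.0 - path_good)
    return p_total, q_total


p_total, q_total = end_to_end_gilbert(0.0052, 0.8, 8)
loss_rate = p_total / (p_total + q_total)
print(f"end-to-end loss rate: {loss_rate:.4f}")       # close to 5 percent
print(f"mean loss run length: {1.0 / q_total:.2f} packets")
```

With these inputs the end-to-end loss rate comes out at roughly 5%, which
agrees with the figure quoted for 8 links in the Figure 5 example.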

For SD delivery over a single path of $N_{\text{single}}$ links, the end-to-end
loss characteristics are modeled by a two-state Gilbert model
whose parameters depend on the number of links (path length
$N_{\text{single}}$) and the parameters for each link. This model expresses
the loss process for the path but not the distortion when video is 
transmitted over that path. One distortion model for SD video 
over a single path is the 4-state model shown in Figure 3, where 
the states denote the number of consecutive losses in the imme- 
diate past. The distortion for bursts of length longer than 3 is 
approximated by that of a length 3 burst. Note that the tran- 
sition probabilities in the distortion model are determined by 
the parameters of the Gilbert model for the path only, while the 
distortion associated with the state transitions is a function of 
the video source only. The 7 SD distortion parameters quantify 
the distortion for each of the transitions. Given this distortion 
model, the average distortion for a particular source and path 
can be easily computed using the stationary distribution of the 
states. 
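
The computation described above can be sketched as follows. The mapping of
the 7 distortion parameters onto the transitions of the 4-state model
(in particular, reusing the third-loss distortion for bursts longer than 3)
is our reading of the text, and the dictionary keys and numerical distortion
values are hypothetical placeholders rather than measured values.

```python
from typing import Dict


def sd_expected_distortion(p: float, q: float, d: Dict[str, float]) -> float:
    """Average distortion of the 4-state SD model.

    States count consecutive losses (0, 1, 2, 3 with saturation at 3).
    p: path-level probability no loss -> loss; q: loss -> no loss.
    """
    # Stationary probabilities of the four states (closed form).
    pi0 = q / (p + q)
    pi1 = pi0 * p
    pi2 = pi1 * (1.0 - q)
    pi3 = pi2 * (1.0 - q) / q
    # Sum over states of: stationary prob * transition prob * distortion.
    return (
        pi0 * ((1.0 - p) * d["no_loss"] + p * d["loss1"])
        + pi1 * (q * d["recover1"] + (1.0 - q) * d["loss2"])
        + pi2 * (q * d["recover2"] + (1.0 - q) * d["loss3"])
        + pi3 * (q * d["recover3"] + (1.0 - q) * d["loss3"])
    )


# Placeholder MSE values, for illustration only.
params = {
    "no_loss": 20.0, "loss1": 120.0, "recover1": 60.0,
    "loss2": 200.0, "recover2": 90.0, "loss3": 260.0, "recover3": 110.0,
}
print(sd_expected_distortion(0.04, 0.77, params))
```

Since the transition probabilities depend only on the path and the
distortions only on the source, the same function can be reused across
paths by changing p and q, or across sources by changing the dictionary.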

MD with two descriptions and two paths from a client to two 
servers is much more complex than SD over a single path, as 
the client and servers can be connected through a wide range 
of different topologies, and different links may be joint (shared) 
by both paths while other links may be disjoint (not shared). 
However, it can be shown [16] that to capture the desired end- 
to-end characteristics, we do not need to distinguish based on 
the specific topology. Instead we can summarize the path di- 
versity to a given client simply in terms of three subpaths and 
the parameters corresponding to the lengths of these subpaths: 
(1) disjoint links along the first path, (2) joint links along the 
first and second paths, and (3) disjoint links along the second 
path. Therefore, the loss process for two-path path diversity
from an MD-CDN to a given client can be expressed by the triplet
$\{N_{\text{Disjoint},1}, N_{\text{Joint}}, N_{\text{Disjoint},2}\}$, where the total number of links
in the first path is $N_{\text{Disjoint},1} + N_{\text{Joint}}$ and the total number of
links in the second path is $N_{\text{Joint}} + N_{\text{Disjoint},2}$. In conclusion,
we do not need to distinguish based on the specific topology,
and instead can summarize the path diversity via three subpaths,
each modeled by a two-state Gilbert model which corresponds
to the concatenation of multiple (bursty) single links of that subpath.
It may appear that this system can be modeled with an 8-state model,
the Cartesian product of the three two-state Gilbert subpaths. However,
the need to distinguish the losses that afflict each description
in the joint subpath, and the dependencies between these losses,
requires a 4-state model for the joint subpath. In addition, the
packet rate (packets/sec) for each joint or disjoint link must be
appropriately accounted for in terms of its Gilbert parameters.
In summary, the loss process for two-path path diversity can be
modeled with a 16-state model (2 x 4 x 2 states) and a corresponding
16 x 16 state transition matrix that expresses the transition
probabilities from one time instant to the next.

To model MD application-level quality, we map the above 
model, which expresses the loss process for two-path path diversity,
to an application-level model which expresses the end-to-end
distortion behavior of both descriptions sent over their
respective paths. It is clear from Figure 2 that the distortion for 
MD video, unlike for SD video, depends critically on whether 
loss afflicts both descriptions at the same time, rather than the 
burst loss length on any single description. Therefore, an ap- 
propriate model that captures the distortion behavior of an MD 
source is the 4-state model in Figure 4, which expresses at each 
point in time whether both descriptions are correctly received 
(state 00), one description is correctly received and one description
is afflicted by losses (states 01 and 10), or both descriptions
are simultaneously afflicted by losses (state 11). Specifically,
the 16-state path diversity model is mapped to the 4-state
application-layer (source) model, where each of the 16 possible 
transitions corresponds to a different loss event and a different 
distortion in the reconstructed video. Each of the 16 transition 
probabilities corresponds to the sum of a subset of the 256 tran- 
sition probabilities in the 16 x 16 state transition matrix of the 
two-path path diversity loss process. The expected MD distor- 
tion is computed based on the 4-state model where the distortion 
for each transition is quantified by a different combination of the 
5 MD distortion parameters. Specifically, the total expected dis- 
tortion is given by the sum of the products of the steady state 
probability for each state times the transition probability out of 
that state times the distortion that results from that transition. It 
is useful to note that the proposed loss model for path diversity 
may also be useful for other applications not related to MD cod- 
ing. Similarly, while the specifics of this MD distortion model 
were chosen to accurately represent our MD coder, other forms 
of MD coding may be analyzed using a very similar model. 
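
The expected-distortion computation described above can be sketched in a few lines of Python. This is a simplified illustration, not the model used for the results in this paper: it assumes two fully disjoint, independent paths, each collapsed into a single two-state Gilbert chain, so the 4-state chain is just the product of the two, and the 16-state model with a joint subpath is not reproduced. The function names and the distortion values in D are hypothetical placeholders, and p is assumed to be the good-to-bad and q the bad-to-good transition probability.

```python
def gilbert(p, q):
    """Two-state Gilbert chain: state 0 = received, state 1 = lost.
    Assumed convention: p = P(0 -> 1), q = P(1 -> 0)."""
    return [[1.0 - p, p], [q, 1.0 - q]]


def product_chain(a, b):
    """4-state chain for two independent paths; states ordered 00, 01, 10, 11."""
    return [[a[i][k] * b[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]


def stationary(P, tol=1e-13, max_iter=100_000):
    """Steady-state probabilities of a row-stochastic matrix (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(x - y) for x, y in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


def expected_distortion(P, D):
    """Sum over all transitions of pi[i] * P[i][j] * D[i][j]."""
    pi = stationary(P)
    n = len(P)
    return sum(pi[i] * P[i][j] * D[i][j] for i in range(n) for j in range(n))


# Hypothetical per-transition distortions (MSE), NOT the paper's 5 MD parameters:
# here the cost depends only on the state being entered (00, 01, 10, 11).
cost_of_state = [0.0, 40.0, 40.0, 400.0]
D = [cost_of_state[:] for _ in range(4)]

P = product_chain(gilbert(0.0052, 0.8), gilbert(0.0052, 0.8))
print(round(expected_distortion(P, D), 4))
```

In the full model, each entry of the 4x4 matrix would instead be obtained by summing the appropriate subset of the 256 entries of the 16x16 matrix, and D would hold the distortion assigned to each of the 16 transitions.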

For convenience of simulation, we assume each link is identical
and parameterized by Gilbert parameters $\{p_0, q_0\}$. Therefore,
given a topology, for our simulations we construct the
Gilbert parameters for each link to produce end-to-end
characteristics similar to those measured in the Internet.

To summarize, assuming all links are identical, the expected
distortion (MSE) for SD is given by $D_{SD}(N_{Single}, p_0, q_0)$ and that
for MD by $D_{MD}(N_{Disjoint-1}, N_{Joint}, N_{Disjoint-2}, p_0, q_0)$.

As an example, Figure 5 illustrates the relative performance
of MD and SD when a client is connected via a symmetric
"Y" ($N_{Disjoint-1} = N_{Disjoint-2}$) and we vary the fraction of the
total number of links that are joint and disjoint. Specifically,
the total length of each path is 8 links, and the number of
joint links is varied from 0 to 8 and the number of disjoint
links therefore varies from 8 to 0. For this plot we assumed
$\{p_0, q_0\} = \{0.0052, 0.8\}$ for each link, where the $p_0$ corresponds
to 5% end-to-end average packet loss for 8 links, and $q_0 = 0.8$
corresponds to the longest average burst length (assuming 30
msec sampling) that we are aware of in the literature [14], [15].
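
As a rough consistency check on these parameters, the following worked calculation assumes (our reading, not stated explicitly above) that $p_0$ is the good-to-bad and $q_0$ the bad-to-good transition probability of each link, and that losses on the 8 links are independent:

```latex
\begin{align*}
\text{per-link loss rate} &= \frac{p_0}{p_0 + q_0} = \frac{0.0052}{0.8052} \approx 0.0065, \\
\text{end-to-end loss over 8 links} &= 1 - (1 - 0.0065)^{8} \approx 0.05, \\
\text{mean burst length per link} &= \frac{1}{q_0} = \frac{1}{0.8} = 1.25 .
\end{align*}
```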




[Figure 3: state diagram; recoverable labels: transition probabilities p and 1-q, distortion labels drop1, drop2, drop3.]

Fig. 3. Expected SD video quality is estimated from this model, where the
four states identify the burst length and where the transition probabilities are
labeled as well as the distortions that result for those transitions.
Fig. 4. Expected MD video quality is estimated from this model, where the four
states identify at any instant in time whether any of the two descriptions
are currently afflicted by losses. Each of the 16 different transition arcs
corresponds to a different distortion (only 4 are labeled).

[Figure 5 panels: "MD and SD distortion" for Foreman and for Bus; x-axis: Number of Joint Links out of a Total of 8 Links.]

Fig. 5. MD versus SD distortion for a symmetric "Y", as we vary the number
of joint links given that the total number of links is 8. MD provides less
distortion than SD for all cases of Foreman (right) and almost all cases for
Bus (left) except where the paths are almost completely joint.

V. Simulation Experiments 

We conducted simulation experiments to study how an MD-CDN
behaves under various conditions. The goal of the experiments
is to examine two questions. First, we investigated
whether, and by how much, an MD-CDN can improve performance
over an SD-CDN while leveraging only existing
server infrastructures such as CDNs and Internet Data Centers
(IDC). Second, we studied the sensitivity of MD-CDN to
MD-optimized algorithms which require information not
available in SD-CDN. We also examined the path diversity
characteristics of a number of real and generated network topologies.

A. Methodology 

In our simulation experiments, we placed servers on both
generated and existing network topologies, and then colored them
with different MD streams. A pair of servers hosting
complementary descriptions is selected for each client request. For
both SD-CDN and MD-CDN, the collective performance across
all clients for each network topology is evaluated. We varied a
number of parameters in our experiments (topology, placement,
coloring and selection algorithms), which are discussed next.

A.1 Topology

To determine how MD-CDN fares in different networks, we 
examined in our experiments a number of different topolo- 
gies which are listed with their characteristics in Table I. The 
AT&T and UUNet ISP backbone graphs are available from 
CAIDA [17]. The AS graph, from NLANR [18], corresponds 
to connectivity among Internet autonomous systems (AS) where 
each node in the graph represents an AS. We also examined gen- 
erated topologies created by the BRITE [19] topology generator 



Name       Type        Date   # Nodes   # Edges
AT&T       ISP         2000        87       195
UUNet      ISP         2001       113      1078
AS         Inter-AS    1999      4830      9078
BRITE-h    Generated   NA        1000      1987
BRITE-f    Generated   NA        1000      1997

TABLE I
Topologies.



from Boston University. BRITE models incremental growth and
preferential connectivity in networks [20], [21], which are
possible causes for power-laws observed in Internet topologies [22].
Using BRITE, we created a two-level hierarchical and a
one-level flat topology that model the Internet.

A.2 MD Surrogate Placement Algorithms 

We used the following placement algorithms to place servers 
on a subset of nodes in a topology: 

• Edge: To emulate surrogate placement in a CDN, we place
servers at the edge of a topology, which we define as nodes with
degree of two to three. If there are more candidate edge nodes
than the desired number of servers, a random tie breaker is used.

• Core: To emulate data center placement at the most connected
part of a network, we place servers at the core of a topology,
which we define as nodes with the highest degree. Again, a
random tie breaker is used to select among multiple nodes with
the same degree.

• IDC: For some ISP graphs, the locations of Internet Data
Centers (IDC) are available. The IDC locations of the AT&T IP
backbone are available [23], which we use to place servers for
the AT&T topology in our experiments. This emulates "hotspot"
placement where client population is most concentrated.
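
The Edge and Core rules above can be sketched as follows. This is an illustrative sketch rather than the simulator used in the experiments: the topology is assumed to be an adjacency dictionary mapping each node to the set of its neighbors, and the function names and the toy graph are hypothetical. IDC placement is omitted because it relies on an external list of data center locations.

```python
import random


def place_edge(adj, k, rng):
    """Edge placement: up to k servers on nodes of degree two or three."""
    candidates = sorted(n for n, nbrs in adj.items() if 2 <= len(nbrs) <= 3)
    rng.shuffle(candidates)  # random choice when there are surplus candidates
    return set(candidates[:k])


def place_core(adj, k, rng):
    """Core placement: the k highest-degree nodes, ties broken at random."""
    nodes = sorted(adj)
    rng.shuffle(nodes)  # random order first; the stable sort keeps it within ties
    nodes.sort(key=lambda n: len(adj[n]), reverse=True)
    return set(nodes[:k])


# Toy topology (hypothetical): hub "h" with four degree-2 neighbors.
adj = {
    "h": {"a", "b", "c", "d"},
    "a": {"h", "b"},
    "b": {"h", "a"},
    "c": {"h", "d"},
    "d": {"h", "c"},
}
rng = random.Random(0)
print(place_core(adj, 1, rng))
print(sorted(place_edge(adj, 2, rng)))
```

Shuffling before a stable sort is one simple way to obtain a random tie breaker among nodes of equal degree; any other unbiased tie-breaking rule would serve equally well.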

We used these simple placement algorithms in the experiments
to examine whether MD-CDN can leverage existing
server infrastructures that are not optimized for MD. All the
algorithms listed above are biased towards SD, such that the
distance from servers to clients (Edge) or from servers to servers
(Core and IDC) is minimized. While the ideal case is to use a
real surrogate location graph from a CDN company, such
information is proprietary and not available.

A.3 MD Distribution Across Surrogates Algorithms

Given a placement of servers, an important question is how 
to distribute the MD descriptions across the surrogates. In gen- 
eral each surrogate may host 0, 1, or both descriptions. In the 
following we use the conceptually useful, though suboptimal, 
notion of coloring, where each surrogate is assigned a single de- 
scription. To compare SD-CDN and MD-CDN performance, we 
instrumented the following coloring algorithms to create one SD 
and two MD scenarios: 

• SD: The SD algorithm randomly selects half of the available 
servers, and places SD at each server. This is to model SD-CDN 
in which each server stores the full content. We also use SD as 
the baseline algorithm to compare with MD-based approaches. 

• MD-half: On the same half of the servers selected by SD, the
MD-half algorithm places both descriptions at each server. As
explained in Section IV, we encode the Bus and Foreman video
sequences into two descriptions such that the resulting total bit
rate equals that of the SD stream. Hence, the MD-half algorithm
imposes the constraints that SD and MD use the same servers,
and also use the same total amount of (1) storage in the
infrastructure and (2) bandwidth to the clients.

• MD-all: The MD-all algorithm randomly places one of the
two descriptions at each and every available server, with the
condition that at least one server is assigned to each color. Here, we
remove the constraint that SD and MD use the same servers;
however, the total storage used in the infrastructure remains the
same for SD and MD, as does the bandwidth to the clients.
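
The three coloring scenarios can be sketched as below. This is a minimal sketch under our own assumptions: a coloring is represented as a dictionary from server to the set of items it stores, descriptions are numbered 0 and 1, and MD-all enforces the "at least one server per color" condition by redrawing until it holds, which is our choice of mechanism rather than something specified above. All names are hypothetical.

```python
import random


def color_sd(servers, rng):
    """SD: a random half of the servers each store the full SD stream."""
    chosen = rng.sample(sorted(servers), len(servers) // 2)
    return {s: {"SD"} for s in chosen}


def color_md_half(sd_coloring):
    """MD-half: the same servers as SD, each holding both descriptions."""
    return {s: {0, 1} for s in sd_coloring}


def color_md_all(servers, rng):
    """MD-all: one random description per server, both descriptions in use."""
    servers = sorted(servers)
    if len(servers) < 2:
        raise ValueError("need at least two servers to host both descriptions")
    while True:  # redraw until each description is hosted somewhere
        coloring = {s: {rng.randrange(2)} for s in servers}
        if set().union(*coloring.values()) == {0, 1}:
            return coloring


rng = random.Random(1)
servers = ["s%d" % i for i in range(8)]
sd = color_sd(servers, rng)
print(sorted(sd))
print(color_md_half(sd))
print(color_md_all(servers, rng))
```

With two descriptions that together match the SD bit rate, MD-half on four servers and MD-all on eight servers both store the equivalent of four full copies, matching the equal-storage constraint described above.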

The motivation for using these coloring algorithms in the
simulation experiments is to examine the hypothesis that placing
only one description on each server, but using twice as many
servers, could provide better path diversity than the
conventional approach where each server would store the complete
video or both descriptions.

A.4 MD Surrogate Selection Algorithms 

Given placement and coloring of servers with the above
algorithms, the surrogate selection problem addresses how to select
for each client the optimal pair of surrogates with
complementary descriptions while accounting for path lengths and
path jointness and disjointness. Conventional CDN selection
assumes a single surrogate over a single path, and selects the best
surrogate based on, for example, shortest path. This may be
extended to an MD-CDN by selecting the two surrogates with
shortest paths; however, this does not consider the jointness or
disjointness of the paths. More sophisticated algorithms that
take into account specific properties of MD and path diversity
in the surrogate selection process can provide improved
performance. To evaluate these benefits we instrumented the following
selection algorithms in our simulation:

• Shortest Path (SP): Pick the two closest servers (with different 
descriptions) to the client. We measure closeness by hop counts. 
If more than one server has the same shortest path distance, a tie 
breaker is chosen randomly. 

• Heuristic: For each pair of servers $S_i$, $S_j$ with complementary
descriptions, we calculate a score using the equation
$(p_i + p_j) + N_{Joint,ij}$, where $p_i$ is the path length in hop counts from $S_i$
to the client (i.e. $p_i = N_{Joint,ij} + N_{Disjoint,i}$), and $p_j$ from
$S_j$ to the client, and where $N_{Joint,ij}$ is the path length of the
joint portion of the two paths. This heuristic algorithm aims to
minimize joint path and total path lengths between a client and
its two servers. Note that we assume the two servers are unique
($i \neq j$), i.e. even if a server is close to the client and has both
descriptions, it will not stream both MD streams to the client.

• Distortion: For each pair of servers with complementary de- 
scriptions, we calculate the expected distortion for a client using 
the method described in Section IV-C. The pair of servers with 
the lowest estimated distortion is then chosen for the client. 
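
The SP and Heuristic rules can be sketched as follows. This is an illustrative sketch under several assumptions of ours: routes follow a breadth-first shortest-path tree rooted at the client, each server holds a single description (as in MD-all), the joint portion of two paths is counted as the number of links they share, and ties are broken by server name for reproducibility where the text above uses a random tie breaker. The Distortion rule would replace the score with the expected distortion from Section IV-C and is not reproduced here. All names and the toy topology are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path_tree(adj, client):
    """BFS from the client; parent[v] is v's next hop toward the client."""
    parent = {client: None}
    queue = deque([client])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def path_links(parent, server):
    """Links on the tree path from server to client, as (node, next_hop) pairs."""
    links, node = [], server
    while parent[node] is not None:
        links.append((node, parent[node]))
        node = parent[node]
    return links


def select_sp(adj, client, color):
    """Shortest Path: the closest server of each description (hop count)."""
    parent = shortest_path_tree(adj, client)
    picks = []
    for description in (0, 1):
        servers = [s for s, c in color.items() if c == description]
        picks.append(min(servers, key=lambda s: (len(path_links(parent, s)), s)))
    return tuple(picks)


def select_heuristic(adj, client, color):
    """Heuristic: minimise (p_i + p_j) + N_joint over complementary pairs."""
    parent = shortest_path_tree(adj, client)
    paths = {s: path_links(parent, s) for s in color}

    def score(pair):
        a, b = pair
        joint = len(set(paths[a]) & set(paths[b]))  # links shared by both paths
        return len(paths[a]) + len(paths[b]) + joint

    pairs = [(a, b) for a, b in combinations(sorted(color), 2)
             if color[a] != color[b]]
    return min(pairs, key=score)


# Toy topology: s1 and s2 share the link r-c; s3 reaches the client via x.
adj = {
    "c": {"r", "x"}, "r": {"c", "s1", "s2"}, "x": {"c", "s3"},
    "s1": {"r"}, "s2": {"r"}, "s3": {"x"},
}
color = {"s1": 0, "s2": 1, "s3": 1}  # description hosted by each server
print(select_sp(adj, "c", color))
print(select_heuristic(adj, "c", color))
```

In the toy topology all three servers are two hops from the client, so SP cannot distinguish s2 from s3, while the joint-link penalty steers the heuristic to the pair whose paths do not overlap.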

The above algorithms are ordered according to increasing de- 
ployment complexity and MD-biased optimization. SP is analo- 
gous to current CDN request direction using DNS, and does not 
consider path characteristics such as disjointness. Heuristic re- 
quires knowledge of paths between each client-server pair, thus 
may demand either dynamic network support or static topology 
snapshots at the selection algorithm. Distortion requires dis- 
tortion parameters for each stream, in addition to knowledge of 
server-client paths. However, given this knowledge, our
analytical models enable us to determine the optimal pair of
surrogates for each client, in terms of minimizing the expected
distortion. Note that this selection problem is particularly
important, as it must be solved every time any client requests any
content. We instrumented these algorithms to evaluate the benefits
to MD-CDN of information that is not necessary in SD-CDN.
This gives us intuition on the incremental deployment issues of
MD-CDN.

TABLE II
Summary of experiments we conducted. "*" denotes all
algorithms are evaluated.
[Table body not recovered.]

A.5 Packet Loss Model

We used the Gilbert loss model developed in Section IV to 
simulate packet losses in our experiments. We fixed q = 0.8 
which corresponds to an expected burst loss length of 1.25; 
studies [14], [15] have shown that consecutive losses (loss run- 
lengths) are short and rarely last more than four packets, and this 
q corresponds to the longest average burst length measurement 
that we are aware of. We chose p to yield a moderate end-to-end 
loss rate of 5% for an average path length of five or eight hops 
(depending on topology). 
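
One way to choose p for a target end-to-end loss rate is sketched below. It assumes, as our own simplification, that p is the good-to-bad and q the bad-to-good transition probability, that all links on a path are identical, and that their losses are independent; the function names are hypothetical.

```python
def per_link_p(target_loss, hops, q=0.8):
    """Gilbert p per link so that `hops` independent, identical links
    produce the target end-to-end loss rate."""
    if not 0.0 < target_loss < 1.0 or hops < 1:
        raise ValueError("need 0 < target_loss < 1 and hops >= 1")
    link_loss = 1.0 - (1.0 - target_loss) ** (1.0 / hops)
    # Steady-state loss of one link is p / (p + q); solve for p.
    return q * link_loss / (1.0 - link_loss)


def end_to_end_loss(p, q, hops):
    """Loss rate over `hops` independent links with Gilbert parameters p, q."""
    link_loss = p / (p + q)
    return 1.0 - (1.0 - link_loss) ** hops


for hops in (5, 8):
    p = per_link_p(0.05, hops)
    print(hops, round(p, 4), round(end_to_end_loss(p, 0.8, hops), 4))
```

Under these assumptions, an 8-hop path needs p of roughly 0.005 per link to reach 5% loss, in line with the value used for Figure 5, and the expected burst length on a link is 1/q = 1.25 packets.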

Table II summarizes the experiments we conducted. 

B. Simulation Results

B.1 MD-CDN Performance in SD-biased Environments

The first question we asked in our simulation experiments is
whether, and by how much, MD-CDN yields improvements over
SD-CDN, while leveraging only existing CDN and IDC
infrastructures. In particular, given servers that are placed to minimize
distance to clients, or servers that are located in the core of the
network, is there enough path diversity in such environments
that MD can utilize to reduce distortions at the clients? We also
assumed in this part of the experiment that only simple network
support is available. To direct client requests to servers, we simply
find the two servers with the shortest paths to the client. As
discussed in the methodology section, we evaluated one SD
scenario, and two MD approaches where in the first both
descriptions are stored in half of the available servers (MD-half), and
in the second one description is stored in every server (MD-all).

Figure 6 shows results for the BRITE-h, AS and AT&T graphs.
Because of limited space, we do not show plots for all
topologies we examined. For each topology, we calculated the
following for each of the SD and MD scenarios. First, we calculated
the cumulative distribution of distortion over the clients. This
shows us the general performance of MD-CDN over SD-CDN,
specifically whether MD-CDN yields lower distortion for the
clients. Second, we calculated the mean and standard deviation
of the distortions over all clients in a topology. Third, we drew a
histogram of the reduction in distortion achieved at each client if
MD-CDN (MD-all) is used instead of SD-CDN. This part of the
experiment is based on the Bus video sequence, which is a complicated
sequence and for which MD and SD have relatively close
performance. For the Foreman sequence MD provides significantly
better performance than SD, and we do not include those results.

From the cumulative distribution plots of Figure 6, we see 
that in general MD-CDN outperforms SD-CDN, even when the 
Fig. 6. Top row: Cumulative distribution of distortion at the clients for SD and the two MD scenarios for BRITE-h (left column), AS (middle column), and ATT
(right column). Middle row: Average distortion and its standard deviation for each scenario. Bottom row: Histograms of the percentage reduction in distortion
when MD is used over SD. We observe that both MD-half and MD-all outperform SD in terms of lower expected distortion at the clients in all three networks.



placement, coloring and selection algorithms are optimized to- 
wards SD. To state specific numbers, for the BRITE-h topology, 
about 13% of the clients experience distortion of 250 or less if 
SD is used to deliver the Bus sequence, but this is true for more 
than 40% to 60% of the clients for both MD scenarios. We ob- 
serve similar results for the rest of the topologies. 

The mean and standard deviation plots and the histogram
plots give us a better idea of how much MD-CDN reduces
distortion over SD-CDN. We see that for all topologies evaluated, the
average distortion for both MD scenarios is lower than for SD. For
example, in the AT&T ISP backbone topology, the average is
between approximately 160 and 180 for MD, whereas SD clients
would experience an average distortion of over 250. We also
observe that, if we utilize all available servers but only store one
description on each (MD-all), the performance is slightly
better than if we store both descriptions in only half of the servers
(MD-half). We found that the average client-server path length
is shorter in MD-all than in MD-half, which follows intuition
because more servers are available in MD-all to service the same
number of clients; thus MD-all is able to provide lower
distortion. This reinforces our proposal of spreading MD
over servers, instead of storing all descriptions for a video
sequence on each server.

To dig a bit deeper, the histogram plots quantify the improve- 
ment of MD-CDN over SD-CDN. We compare the MD-all sce- 
nario to SD since we found that MD-all is slightly better than 



MD-half. The reduction in distortion for each client is
calculated as $(MSE_{SD} - MSE_{MD}) / MSE_{SD}$, where $MSE_{SD}$ is the distortion
for SD, and $MSE_{MD}$ is the distortion for MD. The histogram
plots show that, for certain topologies such as AS and AT&T,
most clients see a 40% distortion reduction. We suspect that
the different results arise from the different characteristics of
the topologies. For example, AS in particular is well connected,
with a few node degrees of over 100, thus yielding short but
diverse routes between nodes.
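
The per-client reduction and its histogram can be computed as in the short sketch below. The client values are made-up numbers chosen only to exercise the code, not measurements from our experiments, and the function names and the 10% bin width are hypothetical choices.

```python
from collections import Counter


def distortion_reduction(mse_sd, mse_md):
    """Fractional reduction (MSE_SD - MSE_MD) / MSE_SD for one client."""
    if mse_sd <= 0:
        raise ValueError("SD distortion must be positive")
    return (mse_sd - mse_md) / mse_sd


def reduction_histogram(pairs, bin_percent=10):
    """Count clients per bin of percentage reduction (lower bin edge as key)."""
    counts = Counter()
    for mse_sd, mse_md in pairs:
        percent = 100.0 * (mse_sd - mse_md) / mse_sd
        counts[int(percent // bin_percent) * bin_percent] += 1
    return dict(sorted(counts.items()))


# Hypothetical (MSE_SD, MSE_MD) pairs, one per client.
clients = [(250.0, 150.0), (260.0, 170.0), (240.0, 230.0), (300.0, 180.0)]
print(reduction_histogram(clients))
```

A client for which MD performs worse than SD yields a negative reduction and falls into a negative bin, so such cases remain visible in the histogram.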

To compare and contrast MD-CDN and SD-CDN from
another viewpoint, we examine the number of servers necessary
to achieve a given average distortion at the clients for both schemes.
Figure 7 illustrates the reduction in distortion at the clients as
we increase the percentage of nodes acting as servers. We
observe that MD streaming requires fewer servers than SD
streaming to achieve the same average distortion. For example,
Figure 7(a) shows that in the BRITE-f topology, SD-CDN would need
50% of the nodes acting as servers to achieve an average
distortion of 150, whereas it only takes MD-CDN about 30%. We see
similar results for other topologies. To get an average distortion
of 170 in the AS topology, SD streaming would need
approximately 40% of its nodes acting as servers, whereas MD needs about
15%. Also, the variance of distortion at the clients is smaller in
the case of MD streaming, as illustrated by the error bars.

To summarize, this part of the experiment shows that MD 
streaming performs better than SD streaming in existing CDN 



0-7803-7477-0/02/$17.00 (C) 2002 IEEE 



IEEE INFOCOM 2002 



[Figure 7 panels: SD distortion (left) and MD distortion (right) versus Percentage of Nodes as Servers, for (a) BRITE-f, (b) UUNet, (c) AS.]

Fig. 7. Average distortion at the clients versus percentage of nodes acting as
servers: MD (right) requires fewer servers than SD (left) to achieve a certain
distortion, and also provides a lower variance in distortion over the clients.



and IDC infrastructures, without using MD-optimized server
placement, coloring and selection that consider path
characteristics. MD streaming also requires fewer servers than SD
streaming to achieve the same average distortion at clients.

B.2 Benefits of Joint/Disjoint Link Knowledge 

We have shown above that MD-CDN outperforms SD-CDN
with SD-biased placement, coloring and selection algorithms. In
the following, we investigate the additional improvement
MD-CDN provides over SD-CDN given joint/disjoint path
information (which is not necessary in SD-CDN). We assume that all
links are identical. Specifically, we compare two ways of directing
a client to appropriate servers: the Distortion selection algorithm,
which assumes knowledge of joint and disjoint links between each
client-server pair, and the Shortest Path (SP) algorithm, which
only requires knowledge of path length. Figure 8 compares the
reduction in distortion for each algorithm. These specific test
conditions favor shortest path selection (for identical links, the
problem largely reduces to minimizing path lengths), and
knowledge of joint/disjoint links provides only marginal additional gain
over simply selecting the closest two distinct servers. However,
in the more typical case where different links have different loss
characteristics, the use of this information may provide
significant improvement over a selection based solely on path length.



[Figure 8 plot; axis label: Reduction in distortion (%).]

Fig. 8. Improvements of MD-CDN over SD-CDN when joint/disjoint link
information is (1) known, and (2) not known. For each topology, there are two
lines: the dotted line below denotes the average and one standard deviation
above and below the mean of reduction in distortion when joint/disjoint link
information is known, and the solid line above is for when it is not known.
All links within a topology are assumed identical.



B.3 Correlation of Disjointness Ratios 

It is also interesting to investigate the disjointness of paths
between a client and its MD servers in the various topologies.
Figure 9 illustrates this correlation for BRITE-h, AS and UUNet
graphs. The x-axis denotes disjointness ratio on the first path,
and y-axis the second path, where disjointness ratio is given by
$N_{Disjoint} / p$, where $N_{Disjoint}$ is the disjoint path length and $p$ is the
total path length. In other words, the larger the disjointness ratio,
the more disjoint is the path. A dot in the scatterplot located at
x = d1, y = d2 means there is a client with disjointness ratio d1
on the first path and d2 on the second path. For each topology,
we calculated the disjointness ratio for each client to its two
MD-all colored surrogates, selected with the Distortion algorithm.
We make a number of observations. For all topologies, there are
only a few dots in the upper left-hand and the lower right-hand
regions in the scatterplots. A dot in one of these regions signifies
asymmetric disjointness ratios, which may arise when a client
is located very close to (or co-located with) a server, thus the
disjointness ratio on that path is very small (or close to zero). In
general, however, most of the dots lie on or around the diagonal,
thus disjointness is more symmetrical than asymmetrical.
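
The disjointness ratio of each path in a pair can be computed as in the following sketch. It assumes that each path is given as a list of links oriented from server to client, that the joint portion is the set of links common to both lists, and that a zero-length path (client co-located with a server) is assigned a ratio of zero; the function name and the example paths are hypothetical.

```python
def disjointness_ratios(path1, path2):
    """Return (d1, d2), where d = N_Disjoint / p for each of the two paths."""
    joint = len(set(path1) & set(path2))  # links shared by both paths

    def ratio(path):
        # Assumption: a co-located client (empty path) gets ratio 0.
        return (len(path) - joint) / len(path) if path else 0.0

    return ratio(path1), ratio(path2)


# Symmetric "Y": both 2-hop paths share the final link into the client.
y_left = [("s1", "r"), ("r", "c")]
y_right = [("s2", "r"), ("r", "c")]
print(disjointness_ratios(y_left, y_right))

# Asymmetric case: the short path lies entirely on the joint portion.
short = [("r", "c")]
long_path = [("s2", "a"), ("a", "r"), ("r", "c")]
print(disjointness_ratios(short, long_path))
```

The second example lands in a corner of the scatterplot, because the nearby server's single link is fully shared while two thirds of the longer path is disjoint.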

VI. Additional Related Work 

This paper is based on applying MD coding and path diversity
in the context of CDN. The idea of using diversity over packet
networks is not new; however, it has received relatively little
attention. Dispersity Routing by Maxemchuk [24] is one of
the first works, and [25] is a more recent example. The approach
of this paper is to leverage the CDN surrogate infrastructure to
provide multiple paths, without requiring explicit path diversity
support from the network.

In prior work [1], [13], MD and path diversity were shown to
provide improved performance for point-to-point
communication over lossy packet networks, when diversity was achieved
using either a relay infrastructure or source-based routing. The
idea of using path diversity for point-to-point video/image
applications is also proposed by [26], [27] for mobile multihop radio
environments, where an MD image coder is used to code each
frame into two descriptions based on a checker-board pattern,
and recent extensions to video over ad-hoc wireless networks are
considered in [28]. In the recent work [29] it is further shown
that path diversity can improve latency and loss characteristics
for real-time voice communication over the Internet by
exploiting the different delay variations along different paths.
Fig. 9. Correlation of disjointness ratios on client-server paths. Each dot in the scatterplot represents a client in the topology. We add small "jitter" to the dots
to help distinguish them clearly. We observe that disjointness of the two client-server paths is somewhat symmetrical, and the upper left-hand and lower
right-hand regions which denote severe disjointness asymmetry are largely unoccupied.



Digital Fountain [30] applies Tornado codes to achieve
reliable data download. Their subsequent work [31] reduces
download times by having a client receive a Tornado encoded file
from multiple mirror servers. The target application of their
approach is bulk data transfer, not real-time video. In contrast,
our paper focuses on streaming media with MDC.

Another interesting, though more distant work, is Resilient 
Overlay Networks (RON) which provide resilience to network 
failures by using an overlay to re-route around failures [32]. 

VII. Summary 

The combination of multiple description coding and path
diversity provides improved error-resilience for streaming media
over best-effort networks. In this work, we use CDNs to
explicitly provide multiple paths over which to deliver complementary
descriptions from different edge servers to a single client. We
show how path diversity can be achieved with CDNs and present
models for estimating the performance of MD and path
diversity. We examine the surrogate coloring and selection problems
for MD-CDNs, and propose an algorithm to select the optimal
pair of surrogates hosting complementary descriptions for each
client. We believe the proposed metric for evaluating MD and
path diversity performance for a client-server pair is a key
component for designing more sophisticated algorithms for MD
surrogate placement and MD distribution across surrogates
("coloring"). We conducted simulation experiments on various
topologies and settings, and found that MD streaming performs
better than SD in existing infrastructures, without MD-optimized
server placement, coloring, or selection algorithms, thereby
reducing distortion at the clients for the same number of
surrogates, or reducing the required number of surrogates to achieve
a desired distortion. In summary, our results show that coupling
MD coding with path diversity from a CDN can provide
significant performance benefits over a conventional SD-CDN.